An Expandable Local and Parallel Two-Grid
Finite Element Scheme

Abstract

An expandable local and parallel two-grid finite element scheme based on the
superposition principle is proposed and analyzed for elliptic problems, taking
the Poisson equation as an example. Compared with the usual local and parallel
finite element schemes, the proposed scheme can be easily implemented on a large
parallel computer system with many CPUs. Convergence results based on $H^1$ and
$L^2$ a priori error estimates of the scheme are obtained, which show that the
scheme reaches the optimal convergence orders within $|\ln H|^2$ or $|\ln H|$
two-grid iterations in the 2-D or 3-D case, respectively, provided the coarse
mesh size $H$ and the fine mesh size $h$ are properly configured. Some numerical
results are presented at the end of the paper to support our analysis.

Key Words two-grid finite element method, domain decomposition method, 

1 Introduction


Two-grid or multi-grid finite element methods and domain decomposition methods
are powerful tools for the numerical simulation of solutions to PDEs with high
resolution, which are otherwise inaccessible due to the limits of computational
resources. Examples include the domain decomposition schemes, nonlinear Galerkin
schemes and two-grid/two-level post-processing schemes in [12, 13, 16, 17, 20]
and the references therein. In the past decade, a local and parallel two-grid
method for elliptic boundary value problems was initially proposed in [21] and
was extended to nonlinear elliptic boundary value problems in [22] and to the
Stokes and Navier-Stokes equations in [10, 11].


Let us briefly recall the local and parallel two-grid finite element method in
[21] for the following simple Poisson equation with Dirichlet boundary condition
defined in a convex domain $\Omega \subset R^d$, $d = 2, 3$:

$$-\Delta u = f \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega, \qquad (1.1)$$

whose weak formulation is: find $u \in H_0^1(\Omega)$ such that

$$a(u, v) = (f, v), \quad \forall v \in H_0^1(\Omega). \qquad (1.2)$$

Suppose $T^H(\Omega)$ is a regular coarse mesh triangulation of $\Omega$ and
$S^H(\Omega) \subset H^1(\Omega)$, $S_0^H(\Omega) = S^H(\Omega) \cap H_0^1(\Omega)$
are the corresponding finite element spaces defined on $T^H(\Omega)$. Let us
decompose the entire domain into a series of disjoint subdomains,
$\bar\Omega = \bigcup_{j=1}^N \bar D_j$. For example, see Fig. 1.

Fig. 1: Decomposition of the domain $\Omega$


If the coarse mesh standard Galerkin approximation $u_H \in S_0^H(\Omega)$ is
obtained, then by expanding each subdomain $D_j$ to another subdomain
$\Omega_j \subset \Omega$ and for a given fine mesh size $h < H$, one of the local
and parallel two-grid schemes proposed in [21] is: find
$e_h^j \in S_0^h(\Omega_j)$ such that

$$a(e_h^j, v) = (f, v) - a(u_H, v), \quad \forall v \in S_0^h(\Omega_j). \qquad (1.3)$$

And the final approximation is defined piecewise by

$$u^h = u_H + e_h^j \quad \text{in } D_j, \quad j = 1, 2, \dots, N.$$

Error estimates in [21] show that $u^h$ can reach the optimal convergence order
in the $H^1$ norm. However, $u^h$ is in general discontinuous, and its $L^2$
error bound does not in general have higher order than its $H^1$ error bound.
To overcome this defect of the algorithm, the authors of [21] modified the above
scheme to ensure the continuity of $u^h$ in $\Omega$ and finally performed a
coarse grid correction to get the optimal error bound in the $L^2$ norm. The
most attractive feature of the algorithm is that the subproblems are
independent once $u_H$ is known, and therefore it is a highly parallel
algorithm.

On the other hand, one can easily see from the content of [21] that the error
estimates there depend heavily on the superapproximation property of finite
element spaces. Thanks to [18], we know that the use of this property makes the
error constant appearing in [21] depend on $t$, where
$t = \mathrm{dist}(\partial D_j \setminus \partial\Omega, \partial\Omega_j \setminus \partial\Omega)$.
To guarantee the error orders obtained in [21], one should demand that
$t = O(1)$. That means the distance between the boundaries of a specific
subdomain $D_j$ and its expansion $\Omega_j$ should be of constant order.
Therefore $\Omega_j$ cannot be arbitrarily small even when
$\mathrm{diam}(D_j)$ tends to zero. This leads to a vast waste of parallel
computing resources.

In this paper, we follow the basic idea presented in [21] to construct another
form of two-grid local and parallel scheme, in which the scale of each
subproblem can be much smaller than that in [21]. In fact, we will deal with the
case of $\mathrm{diam}(D_j) = O(H)$ and $t = O(H)$. We call the scheme an
expandable local and parallel two-grid scheme because the scale of each
subproblem can be arbitrarily small as $H$ tends to zero and every two adjacent
subproblems have only a small overlap. Similarly, to get a better $L^2$ error
bound, a coarse grid correction is done in each cycle of the two-grid iteration.

Different from the previously mentioned local and parallel schemes, we use the
superposition principle to generate a series of local and independent
subproblems, which makes the global approximation continuous in $\Omega$. This
kind of technique has been successfully used in [14, 15], in which adaptive
variational multi-scale methods were constructed. In fact, the scheme in this
paper is quite similar to the variational multi-scale schemes in those two
references. The difference is that the schemes presented in [14, 15] are
adaptive schemes based upon a posteriori error estimates, and therefore some
boundary-related problems have to be solved. Another contribution of this paper
compared with those references is that an a priori error estimate of the scheme
is obtained: for patches of given size, our analysis shows that a few
iterations, namely $O(|\ln H|^2)$ or $O(|\ln H|)$ in 2-D or 3-D respectively,
generate an approximation with the same accuracy as the fine mesh standard
Galerkin approximation. In addition, following the idea of the partition of
unity method, a local and parallel two-grid scheme for second order linear
elliptic equations in the 2-D case has been proposed on the basis of an earlier
local and parallel scheme. Although the use of the partition of unity method
makes the global approximation continuous, the error estimate there is still
based on the superapproximation property of the finite element space, and
therefore the distance $t$ must, theoretically, be of constant order to
guarantee the estimates.

The rest of this paper is organized as follows. In the coming section, some
preliminary materials are provided. In section 3, the local and parallel scheme
is constructed. Error estimates in both the $H^1$ and $L^2$ norms are obtained
for the scheme in section 4. Finally, some numerical experiments are given in
section 5 to support our analysis.


2 Preliminaries 

In this paper, for the sake of simplicity of analysis, we only consider the case
of the Poisson equation (1.1); similar results can be obtained for general
linear elliptic problems with little modification.

For a bounded convex domain $\Omega \subset R^d$, $d = 2, 3$, we use the
standard notation for the Sobolev spaces $W^{s,p}(\Omega)$ and their associated
norms. For $p = 2$, we denote $H^s(\Omega) = W^{s,2}(\Omega)$ and
$H_0^1(\Omega) = \{v \in H^1(\Omega) : v|_{\partial\Omega} = 0\}$,
$\|\cdot\|_{s,\Omega} = \|\cdot\|_{s,2,\Omega}$ and $|\cdot|_{s,\Omega}$ the
corresponding semi-norm. In some places of this paper, $\|\cdot\|_{s,\Omega}$
should be viewed as piecewise defined if necessary.

For simplicity, the symbols $\lesssim$, $\gtrsim$ and $\approx$ will be used in
this paper. In the rest, $x_1 \lesssim y_1$, $x_2 \gtrsim y_2$ and
$x_3 \approx y_3$ mean that $x_1 \le C_1 y_1$, $x_2 \ge c_2 y_2$ and
$c_3 x_3 \le y_3 \le C_3 x_3$ for some constants $C_1$, $c_2$, $c_3$ and $C_3$
that are independent of the mesh size, of $x_i$, $y_i$ and of the local domains
which will be introduced in the following sections. In the following, we denote
by $(\cdot, \cdot)$ the $L^2$ inner product on $\Omega$. Thus
$\|\cdot\|_{0,\Omega} = (\cdot, \cdot)^{1/2}$ and, in $H_0^1(\Omega)$, we know
that $\|\cdot\|_{1,\Omega} \approx \|\nabla \cdot\|_{0,\Omega}$. For simplicity
of expression, we use $\|\cdot\|_\Omega$ to denote $\|\cdot\|_{0,\Omega}$ in the
rest. For subdomains $S_1 \subset S_2 \subset \Omega$,
$S_1 \subset\subset S_2$ means that
$\mathrm{dist}(\partial S_2 \setminus \partial\Omega, \partial S_1 \setminus \partial\Omega) > 0$.

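For any given $S_1 \subset \Omega$, we denote by $(\cdot, \cdot)_{S_1}$ the
$L^2$ inner product on $S_1$ and

$$a(u, v)_{S_1} = (\nabla u, \nabla v)_{S_1}, \qquad a(u, v) = (\nabla u, \nabla v)_\Omega,$$

in the rest of this paper. Then we get the weak form (1.2).
It is obvious that

$$\|\nabla u\|_{S_1}^2 = a(u, u)_{S_1}, \qquad a(u, v)_{S_1} \le \|\nabla u\|_{S_1} \|\nabla v\|_{S_1}. \qquad (2.1)$$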


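We assume that $T^H(\Omega) = \{\tau_\Omega^H\}$ is a regular triangulation of
$\Omega$. Here $H = \max_{\tau_\Omega^H \in T^H(\Omega)} \mathrm{diam}(\tau_\Omega^H)$
is the mesh size parameter. Let

$$S^H(\Omega) = \{v_H \in C^0(\bar\Omega) : v_H|_{\tau_\Omega^H} \in P^r_{\tau_\Omega^H}, \ \forall \tau_\Omega^H \in T^H(\Omega)\}$$

be a $C^0$ finite element space defined on $\Omega$ and
$S_0^H(\Omega) = S^H(\Omega) \cap H_0^1(\Omega)$, where $r \ge 1$ is a positive
integer and $P^r_{\tau_\Omega^H}$ is the space of polynomials of degree not
greater than $r$ defined on $\tau_\Omega^H$. Given $S_1 \subset \Omega$ which
aligns with $T^H(\Omega)$, we define $T^H(S_1)$ and $S^H(S_1)$ to be the
restrictions of $T^H(\Omega)$ and $S^H(\Omega)$ to $S_1$.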

For these finite element spaces and problem (1.2), we make the following
assumptions.

A1. Interpolant. There is a finite element interpolation $I_H$ defined on
$S^H(S_1)$, and we denote $\hat I_H = I - I_H$, such that for any
$w \in H^s(\tau_\Omega^H)$, $0 \le m \le s \le r + 1$,

A2. Inverse Inequality. For any $w \in S^H(\Omega)$,

A3. Regularity. For any $f \in L^2(\Omega)$, the solution of

$$a(u, v) = (f, v), \quad \forall v \in H_0^1(\Omega),$$

satisfies

$$\|u\|_{2,\Omega} \lesssim \|f\|_{L^2(\Omega)}.$$

Now let us state the standard Galerkin equation of (1.2): find
$u_H \in S_0^H(\Omega)$ such that

$$a(u_H, v_H) = (f, v_H), \quad \forall v_H \in S_0^H(\Omega). \qquad (2.2)$$

And it is classical that

$$\|u - u_H\|_\Omega + H \|\nabla(u - u_H)\|_\Omega = O(H^{r+1}), \qquad (2.3)$$

if $u \in H_0^1(\Omega) \cap H^{r+1}(\Omega)$, $r \ge 1$.

3 Local and Parallel Two-Grid Scheme 


Let us denote

$$w = u - u_H \in H_0^1(\Omega).$$

Then the residual equation is

$$a(w, v) = (f, v) - a(u_H, v), \quad \forall v \in H_0^1(\Omega). \qquad (3.1)$$

Assume that $\{\phi_j\}_{j=1}^N$ is a partition of unity on $\Omega$ for a given
integer $N \ge 1$ such that $\Omega \subset \bigcup_{j=1}^N \mathrm{supp}\,\phi_j$
and $\sum_{j=1}^N \phi_j = 1$ on $\Omega$. In the rest of the paper, we denote
$D_j = \mathrm{supp}\,\phi_j$ and always assume that $D_j$ aligns with
$T^H(\Omega)$. We can rewrite (3.1) as

$$a(w, v) = \Big(f, \sum_{j=1}^N \phi_j v\Big) - a\Big(u_H, \sum_{j=1}^N \phi_j v\Big), \quad \forall v \in H_0^1(\Omega).$$

By the superposition principle, the above residual equation is equivalent to the
summation of the following subproblems:

$$a(w^j, v) = (f, \phi_j v) - a(u_H, \phi_j v), \quad \forall v \in H_0^1(\Omega), \quad j = 1, 2, \dots, N. \qquad (3.2)$$


That is, $w = \sum_{j=1}^N w^j$. Each subproblem is a "local residual" equation
with homogeneous Dirichlet boundary condition, driven by a right-hand-side term
of very small compact support, and all the subproblems are independent once
$u_H$ is known. To discretize and localize the "local residual" equation, and
therefore to reduce the computational scale, we restrict the above subproblem
to a local domain $\Omega_j$, which contains $D_j$ and is also assumed to be
aligned with $T^H(\Omega)$. For each local subdomain $\Omega_j$ we assume that
$T^h(\Omega_j) = \{\tau^h_{\Omega_j}\}$ is a regular triangulation on it. Here
$h = \max_{1 \le j \le N} \max_{\tau^h_{\Omega_j} \in T^h(\Omega_j)} \mathrm{diam}(\tau^h_{\Omega_j})$.
For simplicity, the local fine mesh $T^h(\Omega_j)$ is defined as follows
throughout the rest of the paper. For a global regular triangulation
$T^h(\Omega) = \{\tau^h_\Omega\}$ on $\Omega$ which aligns with $T^H(\Omega)$,
we define $T^h(\Omega_j) = T^h(\Omega)|_{\Omega_j}$. For this mesh parameter $h$
($h < H$), we introduce the fine mesh finite element spaces
$S_0^h(\Omega_j)$, $S^h(\Omega)$ and $S_0^h(\Omega)$, which have the same
definitions as $S^H(\Omega)$ and $S_0^H(\Omega)$ given in the previous section.
Since the functions in $S_0^h(\Omega_j)$ can be extended to functions in
$S_0^h(\Omega)$ with zero value outside $\Omega_j$, we regard
$S_0^h(\Omega_j)$ as a subspace of $S_0^h(\Omega)$ in the sense of such zero
extension. Since $T^h(\Omega_j)$ and $T^h(\Omega)$ align with $T^H(\Omega)$, we
always assume

s^{n) c c = |J (3.3) 

^<j<N 


Now we give the approximate "local residual" equation as follows: find
$w^j_{H,h} \in S_0^h(\Omega_j)$ such that

$$a(w^j_{H,h}, v) = (f, \phi_j v) - a(u_H, \phi_j v), \quad \forall v \in S_0^h(\Omega_j), \quad j = 1, 2, \dots, N. \qquad (3.4)$$

It is clear that all subproblems in (3.4) are independent. Note that
$w^j_{H,h}$ can be extended to the entire domain $\Omega$ with zero value
outside $\Omega_j$ in $H_0^1(\Omega)$; we still use $w^j_{H,h}$ to denote such
extension in the rest and we denote

$$w_{H,h} = \sum_{j=1}^N w^j_{H,h}.$$

Now we define the following intermediate approximate solution

$$u_{H,h} = u_H + w_{H,h}. \qquad (3.5)$$

Since the approximation $u_{H,h}$ is obtained by solving a series of local
subproblems which are imposed with artificial homogeneous boundary conditions
of the first kind, some local non-physical oscillation may occur. This will
certainly have a bad influence on the global accuracy of the approximation.
To diminish such influence, we choose to smooth the above intermediate
approximation $u_{H,h}$ by the following coarse grid correction: find
$E_H \in S_0^H(\Omega)$ such that

$$a(E_H, v) = (f, v) - a(u_{H,h}, v), \quad \forall v \in S_0^H(\Omega). \qquad (3.6)$$

And the final approximate solution is defined as

$$u_H^h = u_{H,h} + E_H = u_H + w_{H,h} + E_H. \qquad (3.7)$$
Now for the implementation of the proposed local and parallel two-grid
scheme (3.4)-(3.7), we have to choose a proper partition of unity
$\{\phi_j\}_{j=1}^N$ of $\Omega$ and its associated computational domains
$\Omega_j$. A simple choice of the partition of unity is the piecewise linear
Lagrange basis functions associated with the coarse grid triangulation
$T^H(\Omega)$, where $N$ is the number of vertices of $T^H(\Omega)$ including
the boundary vertices. For each vertex $j$ of the coarse grid, let us denote
$D_j = \mathrm{supp}\,\phi_j$. Then we expand $D_j$ by one coarse mesh layer to
get $\Omega_j$, that is

$$\Omega_j = \bigcup_{x_i \in D_j} D_i,$$

where $x_i$ denotes the $i$th vertex of the coarse mesh triangulation. It is
clear that

$$\mathrm{diam}(D_j), \ \mathrm{dist}(\partial D_j \setminus \partial\Omega, \partial\Omega_j \setminus \partial\Omega) \approx H. \qquad (3.8)$$
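
The patch construction described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the mesh generator `uniform_triangulation` and the helper `patches` are hypothetical names introduced here, and a uniform triangulation of the unit square is assumed. Each $D_j$ is stored as the set of coarse triangles sharing vertex $j$ (the support of the piecewise linear hat function), and $\Omega_j$ is the union of the $D_i$ over all vertices $x_i$ of $D_j$.

```python
import numpy as np

def uniform_triangulation(n):
    """Vertices and triangles of the unit square cut into n x n squares,
    each split by the diagonal from its lower-left to its upper-right corner."""
    idx = lambda i, j: i * (n + 1) + j
    pts = np.array([(i / n, j / n) for i in range(n + 1) for j in range(n + 1)])
    tris = []
    for i in range(n):
        for j in range(n):
            a, b, c, d = idx(i, j), idx(i + 1, j), idx(i + 1, j + 1), idx(i, j + 1)
            tris += [(a, b, c), (a, c, d)]
    return pts, np.array(tris)

def patches(tris, n_vertices):
    """Return (D, Omega): D[j] is the set of triangles touching vertex j,
    Omega[j] is D[j] expanded by one layer of coarse triangles."""
    D = [set() for _ in range(n_vertices)]
    for t, tri in enumerate(tris):
        for v in tri:
            D[v].add(t)
    Omega = []
    for j in range(n_vertices):
        # all coarse vertices lying in D_j (including j itself)
        verts = {int(v) for t in D[j] for v in tris[t]}
        Omega.append(set().union(*(D[i] for i in verts)))
    return D, Omega
```

For a vertex at least two coarse layers away from the boundary of such a mesh, `D[j]` contains 6 triangles and `Omega[j]` contains 24, so both patches have diameter proportional to $H$, in line with (3.8).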


4 Error Estimates 

By the idea of fictitious domain method (see H) , we extend the local sub¬ 
problem (3.4) to n. Let us denote F = 914 and F^ = dflj\T. If we introduce 

= S'Q(f4)|rj C and ^(bj) = {H^{Tj)y which is equipped 

with the following norm 


= sup 


/r, 


we can show that the local residual ^ C S'o (f4) satisfies the following saddle 
point problem: find (rc-^ ^ ^ ^ (bj) such that V(v, /i) S •S'g (f2) x 


>J + < >3 = - a{uH,(j)jv), 


(4.1) 


where 


< fi,v >j= / fivds V/r G ^ (Fj), u G S'g (n). 


To show the well-posedness of the above saddle point problem, let us intro¬ 
duce following two finite element spaces 

SUnj) = {v€S\n,):vUAT,=Oh 

S^in\nj) = {VG z;|o(n\a,)\r, = 0}. 

1 

For given g G (bj), we introduce two auxiliary problems 


and 


^(wl,^')n, = 0, ui\r, = g Vu G S'g (flj), 


i{u 2 ,v)n\Q. ^0, U 2 \rj = g V-c G S'g (n\f4j). 










These two problems define two mappings ^ and 72 ^ from (Tj) into 
and S^{yL\Vtj), respectively. That is 

Ml = li^g, U2 = -i2^g. 

And we know that 


Il7i ^gWH^cij), 1172 ^g\\H^n\nj) < \\g\\^i^^^y 

Then we can define an operator 7 “^ from H^{Tj) into SQ^fl): for any given 


7 ^5 = 


71^5, 

72'^g, mn\nj. 


And we have the following property of 7 

_ 1 

Now for any fj, G ^ (r^), we have 

IImII -1 = sup < sup < sup 

Hh Td dWn^in) «GS'^(a) l|M||_f/i(n) 


< g,,v >j 


This ensures that the saddle point problem is well-posed. And it is straight that 
^ is the solution of this global problem. 

Let us recall the residual equation (3.1) and the "local residual" equations
(3.2). For the previously defined fine mesh $T^h(\Omega)$ and the associated
finite element space $S_0^h(\Omega)$, their fine mesh Galerkin approximations
are as follows. Find $w_h \in S_0^h(\Omega)$ and $w_h^j \in S_0^h(\Omega)$,
$j = 1, 2, \dots, N$, such that

$$a(w_h, v) = (f, v) - a(u_H, v), \quad \forall v \in S_0^h(\Omega), \qquad (4.2)$$

and

$$a(w_h^j, v) = (f, \phi_j v) - a(u_H, \phi_j v), \quad \forall v \in S_0^h(\Omega), \quad j = 1, 2, \dots, N. \qquad (4.3)$$

And we know that $w_h = \sum_{j=1}^N w_h^j$ is the Galerkin approximation of $w$
in $S_0^h(\Omega)$.
If we denote


9i 

we know that satisfies: V(n,/r) G S'o(^) ^ ^ (^i) 

a{w^fj,v)+ < ,v >j + < - gj >j= [fAjv) - a{uH,(l)jv). (4.4) 

Here, we can verify the Lagrange multiplier satisfies 

<C^m>^= 0 , Vug 5 ^( 0 ), 


(4.5) 
if we take $\mu = 0$ in (4.4) and compare (4.4) with (4.3).
If we denote by 


N 


-'H. 


M = Wh- WH,h’ ^H,h = 

i=i 




the "local error" and the global error of $u_{H,h}$, respectively, then by
comparing (4.4) with (4.1) and taking $\mu = 0$, we have


and 


>j= 0, \/v e So{n), j = 1,2, • • • ,N, 


N 

i{eH,h,v) + Y < >j= '^veSgifl). 

1=1 


(4.6) 


It is clear that there exists a positive integer $\kappa$, independent of $N$
and of $x$, such that each point $x \in \Omega$ belongs to at most $\kappa$
different subdomains $\Omega_j$. By using the previously defined operator
$\gamma^{-1}$ and the fact we just stated, we can easily get the following
lemma.


Lemma 4.1: The multipliers^ in (4-.1) satisfies 

lie 

and 






N 


N 


E<?^^>^-^«^(Eii^ii'-l EiiHi^bn). 


1=1 


1=1 


Proof. The first estimate is quite easy if one notices the property of 7 ^ and we 
omit its proof. For the second estimate, thanks to the definition of || • || 1 , 

we have Vx G 


N 


N 


E<e\i^>.<Eii^^ 


11 . 1 , 


1=1 


1=1 




N 


N 


N 


< 




1=1 




N 


<«^(Ell^ll'-i )^ll^ll^bn)- 


1=1 


□ 


Now let us consider the estimation of each ||^||q. To do so, we notice 


that 


I Vc 


hJIq = l|V(u’ff - w^H,h)\\Q = llVwi^IlQ/n, + l|V(u)^ - w^H,h)\\nr 
It is obvious that $e^j_{H,h}$ satisfies the following equation

$$a(e^j_{H,h}|_{\Omega_j}, v) = 0, \quad \forall v \in S_0^h(\Omega_j), \qquad e^j_{H,h}|_{\partial\Omega_j} = w_h^j|_{\partial\Omega_j}.$$

Since G >5'o (^)) we have 

W^nh^an, = II^hII i.a(n\n,) l|Vu>^^||n/n,- 

Then we know that 


(4.7) 


Being aware of (4.7), we give the estimate of $\|\nabla e^j_{H,h}\|_\Omega$,
which plays a crucial role in this section.


Lemma 4.2: Let us denote

$$\alpha_d = \begin{cases} c / |\ln H|^2, & d = 2, \\ c / |\ln H|, & d = 3, \end{cases}$$

where $c > 0$ is a positive constant that does not depend on $H$, $h$ and
$\Omega_j$. Then we have

Proof. To prove this lemma, we first divide the region
$\Omega_j \setminus D_j$ as follows. For example, see Fig. 2.

Fig. 2: Division of the region $\Omega_j \setminus D_j$
Let us denote

= fi° = 

We extend the domain 17° along the outward normal direction on within 

17 by a fine mesh layer to obtain 17j D 17° and denote by 17j the incremental 
annular zone, that is 17] = 17]\0°. Repeat the above procedure until we get 
where M ~ Then we obtain a series of subdomains 


0 3 

and a series of disjoint annular zones 


CC cc • • • cc n 


M 
‘3 ’ 




It is clear that 


n'; = [jn), k = 0,l,2,--- ,M. 


2=0 

In what follows, we denote 
ai7]=7]=ur], -i';=dh)/dVt and r] = 917]\7^^ 
Since 

n (qA ~ , — n Wii .q!} ^. 


=0, VvG ^^(17]), k = l,2, 


and 17] = 17]“^ U 17], we know that 


fc = 0 , 1 , 2 ,-' 
•• ,M, 




(4.8) 


^ G 5*0 (17]). 

3 3 

We define a smooth function ip G S'^(17]) such that ipl^k = 0 for fc > 1 and 

J Ij 

suppi/' = 17], ■!/)(a;) = 1 Va; G 17]“^, 0 < V'< 1 and \Vip{x)\ < h~^. 

By taking 


V = h[ipw^H) G 


in (4.8), we derive 

3 ^ ^ 

< l|Vw]j||n^||V4(V'Wff)|!n<= < ||Vw]^||ok|iV(^/jw]^)||nfc. 


(4.9) 


For ||V(^/>w)]:^)||Qfc, we have 

l|V(V'Wff)llnj < < llVw]j||nj +h~^\\w^H\\n^- (4-10) 

Now let us estimate the $L^2$ norm of $w_h^j$ appearing in (4.10). To do so, we
introduce polar or spherical coordinates $(\rho, \omega)$ with origin at the
$j$th vertex in the 2-D and 3-D case, respectively. Here $\omega = \omega_1$ in
the 2-D case and $\omega = (\omega_1, \omega_2)$ in the 3-D case. The Jacobi
determinant of the transformation between the Cartesian coordinates
$(x_1, \dots, x_d)$ and $(\rho, \omega)$ is

$$J = \frac{\partial(x_1, \dots, x_d)}{\partial(\rho, \omega_1, \dots, \omega_{d-1})} = \rho^{d-1} \delta(\omega), \qquad \delta(\omega) = \sin^{d-2} \omega_1.$$

In the polar or spherical coordinates, let us denote 

: p(u;) = Pj(cd), : p(cd) = Pj(cd), 0 < k < M. 

It is obvious that 

p^(cd) and H< < 1. 

For 1 <k < M and any point (p, w) € noting and H is a 

convex domain, we have 




dwi 


H 




dp 


dp\ < ( 


P, 1 


P 


-Mnf’ 


dp 




dp 


where $\vartheta_d(H) = H^{-1}$ when $d = 3$ and $\vartheta_d(H) = |\ln H|$ when $d = 2$. Thus




< 


la 

„d-l 


p '^\w^f^{p,uj)fS{uj)dpduj 




' dp 


H |2 


dp)S{uj)dpduj 


< hH^^-^ddimWVw- 


,1 11? 
HWQh- 


Since || Vw)^||Qfc < || -I- || and h 2 ij'^ 2 ^ combin¬ 


ing the above estimate with (4.9) and (4.10) admits 






J d 3 

By Young’s inequality, we have 


w^H\\h-^<h-^H<^-^dd{Hmw\ 




or 
where $c > 0$ is a constant that does not depend on $H$, $h$, $j$ and $k$.

By using the last inequality successively, we get 

3 3 3 j 

> ■■■>{! + chH^-'^ra\H))^\\Vw^H\\%. 

3 

Hence

3 3 

Noting M ~ ^ and |lVtc;J^||QM < simple calculation shows that 


(1 + 

This, together with (4.7), concludes the proof of the lemma.


□ 

Following the result in Lemma 4.2, to estimate $\|\nabla e^j_{H,h}\|_\Omega$ we
have to give some estimates of the "local residual" $w_h^j$.

Lemma 4.3: Suppose the assumptions A1, A2 and A3 are valid. Then for
$j = 1, 2, \dots, N$, we have

$$\|\nabla w_h^j\|_\Omega \lesssim \|\nabla(u - u_H)\|_{D_j}.$$


Proof. Thanks to the coercive property in (2.1), we can get from (3.2) that 

l|Vu>^||n= a(w^,u;^) = (/, 

= a{u - UH, (fjWn) = lH{(t>jWH)) 

= a{u - UH,lH{(t>jlHW^H) + Ih[4 >3 IhW^h\)- 


From the continuity property of the bilinear form a{-, •), we know 
a{u - UH,lH{(t>jlHW^H) + Ih[4 >3 IhW^h\) 

< \\W{u-UH)\\D,{miH{4>3lHWmD, + II ||,,, ). 

Thanks to Al, A2 and notice that (fj is a linear function on each and 
\Dcj3,\<H-\ 

rS<ZD, 

<H{Y1 

<H{Y^ [U3D^ilHwU\lg + m3D{lHW^H)€^^^^^ 

<iJ( ^ 
< \\D(j)^iHW^H\\D, + UjD{iHW^H)\\D, ^

Combination of the above estimates yields

$$\|\nabla w_h^j\|_\Omega \lesssim \|\nabla(u - u_H)\|_{D_j}.$$

□ 

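Now we give the error estimates of the scheme (3.4)-(3.7) in the following
theorem.

Theorem 4.1: Suppose that assumptions A1, A2, A3 and (2.3) hold and
$u \in H_0^1(\Omega) \cap H^{r+1}(\Omega)$. Then

$$\|\nabla(u - u_H^h)\|_\Omega \lesssim h^r + H^{\alpha_d} \|\nabla(u - u_H)\|_\Omega,$$

$$\|u - u_H^h\|_\Omega \lesssim h^{r+1} + H \|\nabla(u - u_H^h)\|_\Omega,$$

where $\alpha_d > 0$ is defined in Lemma 4.2.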

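Proof. First of all, we introduce an $H^1$-orthogonal projection $P_H$ from
$H_0^1(\Omega)$ onto $S_0^H(\Omega)$: for given $w \in H_0^1(\Omega)$, find
$P_H w \in S_0^H(\Omega)$ such that

$$a(v, w - P_H w) = 0, \quad \forall v \in S_0^H(\Omega).$$

It is classical that

$$\|(I - P_H) w\|_\Omega \lesssim H \|\nabla w\|_\Omega, \quad \forall w \in H_0^1(\Omega).$$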

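Noticing the definition of $w_{H,h}$ and $\sum_{j=1}^N \phi_j = 1$ in $\Omega$,
the summation of all the equations of (4.1) with $\mu = 0$ gives the equation
satisfied by $w_{H,h}$:

$$a(w_{H,h}, v) = (f, v) - a(u_H, v) - \sum_{j=1}^N \int_{\Gamma_j} \xi^j v \, ds, \quad \forall v \in S_0^h(\Omega). \qquad (4.11)$$

Then we have

$$a(u_{H,h}, v) = (f, v) - \sum_{j=1}^N \int_{\Gamma_j} \xi^j v \, ds, \quad \forall v \in S_0^h(\Omega).$$

Furthermore, we rewrite the coarse mesh correction as

$$a(E_H, v) = (f, P_H v) - a(u_{H,h}, P_H v), \quad \forall v \in S_0^h(\Omega).$$

Adding the above two equations leads to

$$a(u_H^h, v) = (f, v) + (f, P_H v) - a(u_{H,h}, P_H v) - \sum_{j=1}^N \int_{\Gamma_j} \xi^j v \, ds, \quad \forall v \in S_0^h(\Omega).$$

Finally, we obtain

$$a(u_H^h, v) = (f, v) - \sum_{j=1}^N \int_{\Gamma_j} \xi^j (I - P_H) v \, ds, \quad \forall v \in S_0^h(\Omega),$$

and

$$a(u_h - u_H^h, v) = \sum_{j=1}^N \int_{\Gamma_j} \xi^j (I - P_H) v \, ds, \quad \forall v \in S_0^h(\Omega). \qquad (4.12)$$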

Thanks to the above error equation of $u_h - u_H^h$ and Lemmas 4.1, 4.2 and
4.3, we can easily get

$$\|\nabla(u_h - u_H^h)\|_\Omega \lesssim H^{\alpha_d} \|\nabla(u - u_H)\|_\Omega.$$

Then we can derive the first result by using the triangle inequality.

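For the $L^2$ error estimate, we use the Aubin-Nitsche duality argument.
By A3, for $u_h - u_H^h \in L^2(\Omega)$, there exists
$\phi \in H^2(\Omega) \cap H_0^1(\Omega)$ such that

$$a(v, \phi) = (u_h - u_H^h, v), \quad \forall v \in H_0^1(\Omega),$$

and

$$\|\phi\|_{2,\Omega} \lesssim \|u_h - u_H^h\|_\Omega.$$

Taking $v = u_h - u_H^h$ and noting (4.12), we have

$$\|u_h - u_H^h\|_\Omega^2 = a(u_h - u_H^h, \phi) = a(u_h - u_H^h, (I - P_H)\phi).$$

Thus

$$\|u_h - u_H^h\|_\Omega^2 = a(u_h - u_H^h, (I - P_H)\phi) \le \|\nabla(u_h - u_H^h)\|_\Omega \|\nabla(I - P_H)\phi\|_\Omega$$

$$\lesssim H \|\nabla(u_h - u_H^h)\|_\Omega \|\phi\|_{2,\Omega} \lesssim H \|\nabla(u_h - u_H^h)\|_\Omega \|u_h - u_H^h\|_\Omega.$$

By using the triangle inequality, this estimate yields the $L^2$ error estimate. □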
From the results in Theorem 4.1 we see that one two-grid iteration improves the
convergence order of both the $H^1$ and $L^2$ errors of the coarse mesh
standard Galerkin approximation $u_H$ by order $\alpha_d$. It is also easy to
verify that all the above lemmas and the theorem remain valid if we replace
$u_H$ by $u_H^h$. This suggests the following two-grid iteration with

$$K = [\alpha_d^{-1} + 0.5] = \begin{cases} O(|\ln H|^2), & d = 2, \\ O(|\ln H|), & d = 3. \end{cases} \qquad (4.13)$$

(Step 0) Let $k = 0$ and solve (2.2) to get $u_H \in S_0^H(\Omega)$, and we
denote $u_H^{k,h} = u_H$;

(Step 1) Solve the equations in (3.4) with $u_H = u_H^{k,h}$ to get
$\{w^j_{H,h}\}_{j=1}^N$, which are denoted by $\{w^{j,k+1}_{H,h}\}_{j=1}^N$
here. Then we get $u^{k+1}_{H,h}$ by (3.5);

(Step 2) Solve (3.6) with $u_{H,h} = u^{k+1}_{H,h}$ to get $E_H^{k+1}$ and
denote

$$u_H^{k+1,h} = u^{k+1}_{H,h} + E_H^{k+1}.$$

If $k + 1 \ge K$, stop the iteration and denote $u_H^h = u_H^{k+1,h}$, which is
the final approximation with optimal error. Otherwise, let $k := k + 1$ and go
to (Step 1).
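
To make the iteration (Step 0)-(Step 2) concrete, here is a minimal Python sketch for the one-dimensional model problem $-u'' = f$ on $(0,1)$ with piecewise linear elements. It is only an illustration under stated assumptions, not the implementation used for the experiments in this paper: the function names `load`, `energy` and `two_grid` are hypothetical, the fine mesh is a uniform refinement of the coarse one by a factor `m`, the right-hand side is integrated with a 2-point Gauss rule, and all linear systems are solved with dense direct solves. The partition of unity is given by the coarse hat functions, and each $\Omega_j$ is $D_j$ expanded by one coarse element on each side. In this one-dimensional setting the local errors turn out to lie in the coarse space, so a single cycle reproduces the fine-grid Galerkin solution up to rounding; that is a peculiarity of 1-D and should not be expected in 2-D or 3-D, where several cycles are needed.

```python
import numpy as np

def load(f, n, phi):
    """b_i = (f*phi, v_i) on n uniform elements of (0,1), 2-point Gauss rule.
    phi is given by nodal values (it is linear on every fine element)."""
    b = np.zeros(n + 1)
    for lam in (0.5 - 0.5 / np.sqrt(3.0), 0.5 + 0.5 / np.sqrt(3.0)):
        xq = (np.arange(n) + lam) / n
        fw = 0.5 / n * f(xq) * ((1 - lam) * phi[:-1] + lam * phi[1:])
        b[:-1] += fw * (1 - lam)   # left hat function of each element
        b[1:] += fw * lam          # right hat function of each element
    return b

def energy(u, n, phi):
    """r_i = a(u, phi*v_i) for piecewise linear u and phi (nodal values)."""
    du = np.diff(u) * n            # u' on each fine element
    r = np.zeros(n + 1)
    r[:-1] -= du * phi[:-1]
    r[1:] += du * phi[1:]
    return r

def two_grid(f, NH, m, cycles):
    """Run `cycles` two-grid iterations; return (iterate, fine Galerkin solution)."""
    n = NH * m
    A = n * (2 * np.eye(n + 1) - np.eye(n + 1, k=1) - np.eye(n + 1, k=-1))
    A[0, 0] = A[n, n] = n
    # P[i, k] = value of the coarse hat function k at fine node i
    P = np.maximum(0.0, 1.0 - np.abs(np.arange(n + 1)[:, None] / m - np.arange(NH + 1)))
    b = load(f, n, np.ones(n + 1))
    fine_in, coarse_in = np.arange(1, n), np.arange(1, NH)
    AH = (P.T @ A @ P)[np.ix_(coarse_in, coarse_in)]

    def coarse_solve(residual):
        c = np.zeros(NH + 1)
        c[coarse_in] = np.linalg.solve(AH, (P.T @ residual)[coarse_in])
        return P @ c

    u = coarse_solve(b)                          # Step 0: coarse Galerkin solution
    for _ in range(cycles):
        w = np.zeros(n + 1)
        for j in range(NH + 1):                  # Step 1: independent local solves
            phi = P[:, j]
            rhs = load(f, n, phi) - energy(u, n, phi)
            lo, hi = max(0, j - 2), min(NH, j + 2)
            idx = np.arange(lo * m + 1, hi * m)  # interior fine nodes of Omega_j
            w[idx] += np.linalg.solve(A[np.ix_(idx, idx)], rhs[idx])
        u = u + w                                # intermediate approximation (3.5)
        u = u + coarse_solve(b - A @ u)          # Step 2: coarse grid correction
    u_h = np.zeros(n + 1)
    u_h[fine_in] = np.linalg.solve(A[np.ix_(fine_in, fine_in)], b[fine_in])
    return u, u_h
```

Calling `two_grid(f, NH=8, m=8, cycles=k)` for `k = 0` and `k = 1` with a vectorized `f` and comparing the two returned vectors shows the effect of one cycle. The loop over `j` has no data dependence between iterations, which is the property that allows the local solves to be distributed over processors.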


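Corollary 4.1: Suppose $u \in H_0^1(\Omega) \cap H^{r+1}(\Omega)$. Then the
final approximation $u_H^h$ of the scheme (Step 0)-(Step 2) has the following
error bounds:

$$\|\nabla(u - u_H^h)\|_\Omega \lesssim h^r + H^{r+1}, \qquad (4.14)$$

$$\|u - u_H^h\|_\Omega \lesssim h^{r+1} + H^{r+2}. \qquad (4.15)$$

It is obvious that, to get the optimal $H^1$ or $L^2$ error, we should
configure $H$ and $h$ such that

$$h^r \approx H^{r+1} \quad \text{or} \quad h^{r+1} \approx H^{r+2}, \qquad (4.16)$$

respectively.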
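
As a small worked example of this mesh configuration, the following Python sketch solves the balance conditions for $h$, reading them as $h^r \approx H^{r+1}$ for the $H^1$ error and $h^{r+1} \approx H^{r+2}$ for the $L^2$ error. The helper name `fine_mesh_size` and its interface are hypothetical, and the proportionality constant is taken to be 1 for illustration.

```python
def fine_mesh_size(H, r=1, norm="H1"):
    """Fine mesh size balancing the two error terms for elements of degree r.

    norm="H1": h^r = H^(r+1);  norm="L2": h^(r+1) = H^(r+2).
    """
    if norm == "H1":
        return H ** ((r + 1) / r)
    if norm == "L2":
        return H ** ((r + 2) / (r + 1))
    raise ValueError("norm must be 'H1' or 'L2'")
```

For linear elements ($r = 1$) this gives $h = H^2$ for the $H^1$ error and $h = H^{3/2}$ for the $L^2$ error, so a coarse mesh with $H^{-1} = 32$ pairs with $h^{-1} = 1024$ in the first case, and $H^{-1} = 64$ pairs with $h^{-1} = 512$ in the second.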


5 Numerical Experiments 


In this section, we give some numerical examples to verify the analysis results.
For simplicity, in all numerical examples we consider the following piecewise
linear finite element spaces, that is, $r = 1$:

$$S^H(\Omega) = \{v \in C^0(\bar\Omega) : v|_{\tau_\Omega^H} \in P^1_{\tau_\Omega^H}, \ \forall \tau_\Omega^H \in T^H(\Omega)\}, \qquad S_0^H(\Omega) = S^H(\Omega) \cap H_0^1(\Omega).$$

According to (4.16), to reach the $H^1$ accuracy of the standard Galerkin
approximation in $S_0^h(\Omega)$, we choose $H$ and $h$ such that
$h \approx H^2$. In this case

$$\|\nabla(u - u_H^h)\|_\Omega = O(H^2). \qquad (5.1)$$

On the other hand, to reach the $L^2$ accuracy of the standard Galerkin
approximation in $S_0^h(\Omega)$, we choose $H$ and $h$ such that
$h \approx H^{3/2}$. With such configuration, we have

$$\|u - u_H^h\|_\Omega = O(H^3). \qquad (5.2)$$


In the following, we try to verify (5.1)-(5.2) and the efficiency of the
proposed iteration scheme (Step 0)-(Step 2) of the last section by some
numerical experiments. On the one hand, since all the subproblems in (Step 1)
are independent of each other once the coarse mesh approximation $u_H$ is at
hand, there is no communication cost when solving them simultaneously on a
parallel computer system. The only thing one should pay attention to is the
computation of the right-hand sides in (Step 1) and (Step 2). In (Step 1) we
only need the coarse mesh information on $D_j$, the support of $\phi_j$. In
(Step 2), the second term of the right-hand side of (3.6) is calculated on each
fine mesh element, which dramatically increases the computing time compared
with the standard coarse mesh Galerkin method. How to calculate this term
efficiently is therefore critical to making the scheme more efficient.

Fig. 3: 2-D coarse mesh $T^H(\Omega)$    Fig. 4: 2-D local fine mesh

On the other hand, very large parallel computer systems with a huge number of
computing cores make it possible to deal with large scale computations and to
get very accurate approximations. In this case, the scale of the coarse mesh
standard Galerkin scheme and of the coarse mesh correction problem can be very
large, and solving the coarse mesh problem can become the bottleneck of the
entire iterative scheme. Therefore, some parallel solver should be used for the
correction step, for example the algebraic multi-grid method, which is well
known for its efficiency. In the following numerical experiments, we compare
the numerical performance of the iterative scheme proposed in the last section
with that of the fine grid standard Galerkin scheme. For the fine grid standard
Galerkin scheme, we use pARMS (parallel Algebraic Multilevel Solver) as the
parallel sparse solver. In these experiments, the scales of both the fine mesh
subproblems and the coarse mesh correction problem are not very large, so we
only use a direct method for solving them, although a parallel sparse solver
could be applied to the coarse mesh correction step when its scale is large.
The results show that the efficiency of the proposed iteration scheme is higher
than that of the fine mesh standard Galerkin scheme with the previously
mentioned parallel sparse solver.

First, we consider two 2-D examples. In these two examples, the domain is the
unit square $\Omega = (0,1) \times (0,1)$ with a uniform triangulation
$T^H(\Omega) = \{\tau_\Omega^H\}$, see Fig. 3. Fig. 4 shows the local fine mesh
defined on the $j$th coarse mesh node.

In the first 2-D example, we consider the problem with the following analytic
solution

$$u(x, y) = 100(x^2 - 2x^3 + x^4)(y - 3y^2 + 2y^3).$$

In this case, we can get the exact error of the numerical solution.
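
A test problem of this kind is set up by differentiating the analytic solution to obtain the right-hand side $f = -\Delta u$. The following Python sketch does this by hand for the polynomial read here as $u(x, y) = 100(x^2 - 2x^3 + x^4)(y - 3y^2 + 2y^3)$; the exponents are an assumption recovered from the requirement that $u$ vanish on the boundary of the unit square, and the function names `u_exact` and `f_rhs` are hypothetical.

```python
import numpy as np

def u_exact(x, y):
    """Assumed analytic solution; it vanishes on the boundary of the unit square."""
    return 100.0 * (x**2 - 2 * x**3 + x**4) * (y - 3 * y**2 + 2 * y**3)

def f_rhs(x, y):
    """Right-hand side f = -Laplace(u), differentiated by hand."""
    p = x**2 - 2 * x**3 + x**4          # factor depending on x
    q = y - 3 * y**2 + 2 * y**3         # factor depending on y
    ddp = 2 - 12 * x + 12 * x**2        # p''(x)
    ddq = -6 + 12 * y                   # q''(y)
    return -100.0 * (ddp * q + p * ddq)
```

Evaluating `u_exact` on the four sides of the square and comparing `f_rhs` with a finite-difference Laplacian of `u_exact` at an interior point is a quick sanity check before the data are passed to a finite element code.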

In the following Tab. 1 and Tab. 3 we give some numerical results according to
the above configurations of $H$ and $h$. In Tab. 2 and Tab. 4 we give the CPU
time comparison for the fine mesh standard Galerkin method and the scheme
proposed in this paper with respect to getting optimal $H^1$ and $L^2$ errors,
respectively. That is, in Tab. 2 we show the CPU time used when $h = H^2$ and
in Tab. 4 we show the CPU time used when $h = H^{3/2}$. For numerical
experiments in which the true solution $u$ is known, we define the convergence
order "ORDER1" with respect to the coarse mesh size $H$ as

$$\mathrm{ORDER}_1(u_{app}) = \begin{cases} 1 + \ln \dfrac{\|\nabla(u - u_H)\|_{0,\Omega}}{\|\nabla(u - u_{app})\|_{0,\Omega}} \Big/ |\ln H|, & H^1 \text{ error order}, \\[2ex] 2 + \ln \dfrac{\|u - u_H\|_{0,\Omega}}{\|u - u_{app}\|_{0,\Omega}} \Big/ |\ln H|, & L^2 \text{ error order}. \end{cases}$$

The symbol $u_{app}$ stands for a certain approximation of $u$ defined in the
algorithm. And "Iteration" stands for the number of iterations that are taken
to obtain the final approximation.
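
The convergence order can be evaluated directly from tabulated errors. The sketch below assumes the reading $\mathrm{ORDER}_1 = s + \ln(e_H / e_{app}) / |\ln H|$ with $s = 1$ for the $H^1$ error and $s = 2$ for the $L^2$ error, where $e_H$ is the error of the coarse Galerkin solution and $e_{app}$ the error of the approximation under study; the function name `order1` and its arguments are hypothetical.

```python
import math

def order1(err_coarse, err_app, H, base):
    """Convergence order with respect to H.

    err_coarse: error of the coarse Galerkin solution u_H
    err_app:    error of the approximation u_app in the same norm
    base:       1 for the H^1 error order, 2 for the L^2 error order
    """
    return base + math.log(err_coarse / err_app) / abs(math.log(H))
```

For instance, with the $H^1$ errors reported in Tab. 1 for $H^{-1} = 16$, `order1(3.4949e-1, 2.2041e-2, 1/16, 1)` agrees with the tabulated order 1.9968 to about three decimal places.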


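Tab. 1: $H^1$ error ($h = H^2$)

$H^{-1}$                                   8                   16                  32
$\|\nabla(u - u_H)\|_{0,\Omega}$           6.8371 × 10^{-1}    3.4949 × 10^{-1}    1.7574 × 10^{-1}
$\|\nabla(u - u_h)\|_{0,\Omega}$           8.7995 × 10^{-2}    2.2009 × 10^{-2}    5.5023 × 10^{-3}
$\|\nabla(u - u_H^{1,h})\|_{0,\Omega}$     1.6292 × 10^{-1}    6.7335 × 10^{-2}    2.3873 × 10^{-2}
$\|\nabla(u - u_H^h)\|_{0,\Omega}$         8.8087 × 10^{-2}    2.2041 × 10^{-2}    5.5291 × 10^{-3}
ORDER1($u_H^{1,h}$)                        1.69                1.59                1.58
ORDER1($u_H^h$)                            1.98                1.9968              1.9981
Iteration                                  2                   2                   2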

Tab. 2: CPU time comparison ($h = H^2$, $H^{-1} = 32$)

NP                      4           8           16
CPU time ($u_h$)        1532.71s    940.56s     559.12s
CPU time ($u_H^h$)      531.67s     265.92s     133.05s


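The second 2-D example is defined by giving

$$f = 70 \log((x + 0.1)(\sin \pi y + 1)).$$

In this example, since the exact solution $u$ is unknown, the convergence order
of the approximate solution is calculated as

$$\mathrm{ORDER}_2(u_{app}) = \begin{cases} \min\Big\{2, \ 1 + \ln \dfrac{\|\nabla(u_h - u_H)\|_{0,\Omega}}{\|\nabla(u_h - u_{app})\|_{0,\Omega}} \Big/ |\ln H| \Big\}, & H^1 \text{ error order}, \\[2ex] \min\Big\{3, \ 2 + \ln \dfrac{\|u_h - u_H\|_{0,\Omega}}{\|u_h - u_{app}\|_{0,\Omega}} \Big/ |\ln H| \Big\}, & L^2 \text{ error order}. \end{cases}$$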
Tab. 3: $L^2$ error ($h = H^{3/2}$)

$H^{-1}$                          25                  36                  49                  64
$\|u - u_H\|_{0,\Omega}$          3.3784 × 10^{-3}    1.6347 × 10^{-3}    8.8361 × 10^{-4}    5.1832 × 10^{-4}
$\|u - u_h\|_{0,\Omega}$          1.3597 × 10^{-4}    4.5545 × 10^{-5}    1.8145 × 10^{-5}    8.1066 × 10^{-6}
$\|u - u_H^h\|_{0,\Omega}$        1.1498 × 10^{-4}    3.8173 × 10^{-5}    1.5086 × 10^{-5}    7.5306 × 10^{-6}
ORDER1($u_H^h$)                   3.05                3.05                3.05                3.02
Iteration                         1                   1                   2                   2


Tab. 4: CPU time comparison ($h = H^{3/2}$, $H^{-1} = 64$)

NP                      4           8           16
CPU time ($u_h$)        86.56s      63.17s      49.47s
CPU time ($u_H^h$)      119.34s     64.76s      36.12s


Here $u_h$ is the standard Galerkin approximation in the fine mesh finite
element space $S_0^h(\Omega)$ and the symbol $u_{app}$ stands for a certain
approximation of $u$ defined in the algorithm. Since, for example, the $H^1$
error of the fine mesh standard Galerkin approximation satisfies, when
$h = H^2$,

$$\|\nabla(u - u_h)\|_{0,\Omega} = O(h) = O(H^2),$$

an "ORDER2($u_{app}$)" value of 2 calculated by the above formula means

$$\|\nabla(u_h - u_{app})\|_{0,\Omega} = O(H^2),$$

and therefore

$$\|\nabla(u - u_{app})\|_{0,\Omega} = O(H^2).$$

Tab. 5 and Tab. 6 give the numerical results of this test problem.


Tab. 5: $H^1$ error ($h = H^2$)

$H^{-1}$                                     8                   16                  32
$\|\nabla(u_h - u_H)\|_{0,\Omega}$           1.8407 × 10^{0}     9.7353 × 10^{-1}    4.9548 × 10^{-1}
$\|\nabla(u_h - u_H^{1,h})\|_{0,\Omega}$     3.1398 × 10^{-1}    1.6695 × 10^{-1}    7.3657 × 10^{-2}
$\|\nabla(u_h - u_H^h)\|_{0,\Omega}$         9.4723 × 10^{-2}    4.1034 × 10^{-2}    9.4765 × 10^{-4}
ORDER2($u_H^{1,h}$)                          1.85                1.64                1.55
ORDER2($u_H^h$)                              2                   2                   2
Iteration                                    1                   1                   2


In the rest of this section, we give two 3-D numerical examples. In these two
examples, the domain $\Omega$ is the unit cube $(0,1)^3$. Fig. 5 and Fig. 6
give sketches of the 3-D coarse mesh and the associated local fine mesh.
Tab. 6: $L^2$ error ($h = H^{3/2}$)

$H^{-1}$                          25                  36                  49                  64
$\|u_h - u_H\|_{0,\Omega}$        8.0991 × 10^{-3}    3.9813 × 10^{-3}    2.1717 × 10^{-3}    1.2812 × 10^{-3}
$\|u_h - u_H^h\|_{0,\Omega}$      2.6319 × 10^{-4}    4.5438 × 10^{-6}    1.9168 × 10^{-6}    1.1412 × 10^{-6}
ORDER2($u_H^h$)                   3                   3                   3                   3
Iteration                         1                   2                   2                   2


The first 3-D example is a test problem with the following analytic solution

$$u(x, y, z) = 100(x^2 - 2x^3 + x^4)(y - 3y^2 + 2y^3)(z^2 - z).$$

The numerical results are given in Tab. 7 and Tab. 8.

Fig. 5: 3-D coarse mesh $T^H(\Omega)$    Fig. 6: 3-D local fine mesh

The second example of the 3-D case is a test problem driven by the following
free term

$$f = 70 \log((x + 0.1)(\sin \pi y + 1)(z + 0.1)(\sin \pi z + 1)),$$

whose numerical results are given in Tab. 9 and Tab. 10.

All the above numerical results are obtained by using the public domain
software FreeFem++ [24].
Tab. 7: $H^1$ error ($h = H^2$)

$H^{-1}$                                   6                   8                   10                  12
$\|\nabla(u - u_H)\|_{0,\Omega}$           2.9515 × 10^{-1}    2.2774 × 10^{-1}    1.8467 × 10^{-1}    1.5504 × 10^{-1}
$\|\nabla(u - u_h)\|_{0,\Omega}$           5.9048 × 10^{-2}    3.3496 × 10^{-2}    2.1515 × 10^{-2}    1.4969 × 10^{-2}
$\|\nabla(u - u_H^{1,h})\|_{0,\Omega}$     8.5886 × 10^{-2}    6.2379 × 10^{-2}    4.7159 × 10^{-2}    3.6729 × 10^{-2}
$\|\nabla(u - u_H^h)\|_{0,\Omega}$         5.9436 × 10^{-2}    3.3868 × 10^{-2}    2.1682 × 10^{-2}    1.5110 × 10^{-2}
ORDER1($u_H^{1,h}$)                        1.69                1.62                1.59                1.58
ORDER1($u_H^h$)                            1.89                1.92                1.93                1.94
Iteration                                  2                   2                   2                   2


Tab. 8: $L^2$ error ($h = H^{3/2}$)

$H^{-1}$                          9                   16                  25
$\|u - u_H\|_{0,\Omega}$          8.6338 × 10^{-3}    2.8568 × 10^{-3}    1.1850 × 10^{-3}
$\|u - u_h\|_{0,\Omega}$          1.2840 × 10^{-3}    2.3929 × 10^{-4}    6.3981 × 10^{-5}
$\|u - u_H^h\|_{0,\Omega}$        1.3024 × 10^{-3}    2.3795 × 10^{-4}    6.2568 × 10^{-5}
ORDER1($u_H^h$)                   2.86                2.90                2.91
Iteration                         2                   2                   2


Remark. From the construction of the partition of unity used in the algorithm,
we see that the computational domain of each local subproblem is contained in a
ball of radius $O(H)$. This means that the volume of the computational domain
of each local subproblem tends to zero as the coarse mesh size $H$ tends to
zero. In this sense, we call the algorithm given in this paper an expandable
local and parallel two-grid finite element algorithm.